DNM Checking CI job result after enabling disk IOPS/RW limitation #2961

Closed
danpawlik wants to merge 2 commits into openstack-k8s-operators:main from danpawlik:check-iops-limitation
@danpawlik

@danpawlik danpawlik commented May 8, 2025

Contributor

CI still occasionally hits issues caused by "noisy neighbors". We can still ask the infra team to apply limits in the flavor, but until we know what quota to set, let's do it via systemd. This covers many services, and CRC as well: kubelet is configured to use the systemd cgroup driver in /etc/kubernetes/kubelet.conf, so it should respect those limits.

More information on which values were set is in the commit message and in PR [1].

[1] https://review.rdoproject.org/r/c/config/+/57582/

Depends-On: openstack-k8s-operators/openstack-operator#1434
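
For readers unfamiliar with the systemd approach, here is a minimal sketch of what such a limit could look like. It assumes the limits are expressed with the standard systemd.resource-control directives (IOReadIOPSMax, IOWriteIOPSMax, IOReadBandwidthMax, IOWriteBandwidthMax); whether [1] uses exactly these is an assumption. The helper name, the device path /dev/vda, and the numbers are illustrative only.

```python
def render_io_limit_dropin(device: str, iops: int, bandwidth: str) -> str:
    """Return the text of a systemd drop-in that caps disk I/O for one unit.

    device, iops and bandwidth are hypothetical inputs for illustration;
    bandwidth uses systemd's size suffixes (K, M, G, T), e.g. "250M".
    """
    lines = [
        "[Service]",
        # Cap the number of read/write operations per second on the device.
        f"IOReadIOPSMax={device} {iops}",
        f"IOWriteIOPSMax={device} {iops}",
        # Cap read/write throughput (bytes per second) on the device.
        f"IOReadBandwidthMax={device} {bandwidth}",
        f"IOWriteBandwidthMax={device} {bandwidth}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values only; not taken from the referenced change.
    print(render_io_limit_dropin("/dev/vda", 15000, "250M"), end="")
```

The rendered text would typically be written to a drop-in file under the unit's `.d` directory, followed by `systemctl daemon-reload` and a restart of the unit.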

@openshift-ci

openshift-ci Bot commented May 8, 2025

Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci

openshift-ci Bot commented May 8, 2025

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@danpawlik

Contributor Author

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/4dd7e96f32d9444684ac5ec2fd9b7778

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 56m 17s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 09m 31s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 36m 13s
✔️ podified-multinode-hci-deployment-crc SUCCESS in 1h 39m 41s
✔️ cifmw-multinode-tempest SUCCESS in 1h 33m 07s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 15s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 00s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 42s
cifmw-multinode-kuttl FAILURE in 2h 24m 25s
✔️ ci-framework-openstack-meta-content-provider SUCCESS in 16m 05s
✔️ build-push-container-cifmw-client SUCCESS in 20m 58s

@danpawlik

Contributor Author

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/4b2b7e9d11314a3ea6242b61367a51df

openstack-k8s-operators-content-provider FAILURE in 7m 21s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-crc-podified-edpm-baremetal SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ podified-multinode-hci-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-multinode-tempest SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 05s
✔️ cifmw-pod-pre-commit SUCCESS in 7m 54s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 50s
cifmw-multinode-kuttl FAILURE in 2h 23m 36s
ci-framework-openstack-meta-content-provider FAILURE in 8m 29s
✔️ build-push-container-cifmw-client SUCCESS in 21m 00s

@danpawlik

Contributor Author

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/eb552d8ff1bd47cba9b68e96e530c51c

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 53m 43s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 11m 12s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 39m 20s
✔️ podified-multinode-hci-deployment-crc SUCCESS in 1h 32m 54s
✔️ cifmw-multinode-tempest SUCCESS in 1h 29m 45s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 9m 02s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 21s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 55s
cifmw-multinode-kuttl FAILURE in 2h 01m 10s
✔️ ci-framework-openstack-meta-content-provider SUCCESS in 14m 53s
✔️ build-push-container-cifmw-client SUCCESS in 17m 45s

@danpawlik

Contributor Author

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/dfedc4772b52442babf1e25b2d542ba6

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 54m 41s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 11m 50s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 36m 45s
✔️ podified-multinode-hci-deployment-crc SUCCESS in 1h 38m 28s
✔️ cifmw-multinode-tempest SUCCESS in 1h 30m 16s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 07s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 02s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 50s
cifmw-multinode-kuttl FAILURE in 2h 26m 08s
✔️ ci-framework-openstack-meta-content-provider SUCCESS in 16m 26s
✔️ build-push-container-cifmw-client SUCCESS in 21m 27s

@danpawlik danpawlik force-pushed the check-iops-limitation branch from aeded39 to 3ac70f4 Compare May 9, 2025 08:24
@danpawlik

Contributor Author

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/72e20fc6686e48509120326f8b730559

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 50m 10s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 11m 35s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 32m 15s
✔️ podified-multinode-hci-deployment-crc SUCCESS in 1h 33m 52s
✔️ cifmw-multinode-tempest SUCCESS in 1h 35m 51s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 26s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 50s
✔️ cifmw-pod-zuul-files SUCCESS in 5m 01s
cifmw-multinode-kuttl TIMED_OUT in 2h 40m 58s
✔️ ci-framework-openstack-meta-content-provider SUCCESS in 47m 48s
✔️ build-push-container-cifmw-client SUCCESS in 22m 13s

@danpawlik

Contributor Author

recheck

@danpawlik

Contributor Author

Increased the IOPS and RW limits - https://review.rdoproject.org/r/c/config/+/57595 - the job is just a few minutes short of finishing. Eh.

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/f509e8861f054906b93c36547388136d

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 53m 23s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 12m 24s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 33m 25s
✔️ podified-multinode-hci-deployment-crc SUCCESS in 1h 38m 08s
✔️ cifmw-multinode-tempest SUCCESS in 1h 30m 51s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 43s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 50s
✔️ cifmw-pod-zuul-files SUCCESS in 5m 04s
cifmw-multinode-kuttl TIMED_OUT in 2h 41m 30s
✔️ ci-framework-openstack-meta-content-provider SUCCESS in 58m 03s
✔️ build-push-container-cifmw-client SUCCESS in 21m 15s

@danpawlik danpawlik force-pushed the check-iops-limitation branch 2 times, most recently from 2e96cee to a2fb65e Compare May 12, 2025 09:14
@softwarefactory-project-zuul

Merge Failed.

This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset.
Warning:
Error merging github.com/openstack-k8s-operators/ci-framework for 2961,a2fb65e01eab327a93a881f9d64e81edddac59e9

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/ca4fb80ccbcf419581e1feef033ebe80

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 51m 11s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 10m 28s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 29m 06s
✔️ podified-multinode-hci-deployment-crc SUCCESS in 1h 33m 07s
✔️ cifmw-multinode-tempest SUCCESS in 1h 37m 22s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 09s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 38s
✔️ cifmw-pod-zuul-files SUCCESS in 5m 11s
cifmw-multinode-kuttl TIMED_OUT in 2h 40m 30s
✔️ ci-framework-openstack-meta-content-provider SUCCESS in 49m 28s
✔️ build-push-container-cifmw-client SUCCESS in 21m 25s

@danpawlik

Contributor Author

15k IOPS is not enough.

The kubelet service logs provide a lot of useful information
that can help identify the root cause of a failing job.

Signed-off-by: Daniel Pawlik <dpawlik@redhat.com>
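
As an illustration of collecting those logs, here is a minimal sketch that reads them from the systemd journal with journalctl. The function names, the destination path, and the use of journalctl itself are assumptions; the commit may gather the logs differently.

```python
import subprocess
from typing import List, Optional


def kubelet_log_command(since: Optional[str] = None) -> List[str]:
    """Build a journalctl command line that dumps the kubelet unit's logs."""
    cmd = ["journalctl", "--unit", "kubelet", "--no-pager"]
    if since:
        # Restrict output to recent entries, e.g. "1 hour ago".
        cmd += ["--since", since]
    return cmd


def collect_kubelet_logs(dest: str, since: Optional[str] = None) -> None:
    """Write the kubelet journal to dest (a hypothetical log artifact path)."""
    with open(dest, "w") as fh:
        subprocess.run(kubelet_log_command(since), stdout=fh, check=True)
```

Calling `collect_kubelet_logs("kubelet.log", since="1 hour ago")` on the CRC node would then leave a file that can be archived with the other job artifacts.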
@danpawlik danpawlik force-pushed the check-iops-limitation branch from b0d0e5e to fc44237 Compare May 12, 2025 12:00
@danpawlik

Contributor Author

recheck

@danpawlik

Contributor Author

recheck

@danpawlik

danpawlik commented May 12, 2025

Contributor Author

@danpawlik

Contributor Author

recheck

rdoproject pushed a commit to rdo-infra/review.rdoproject.org-config that referenced this pull request May 13, 2025
After a few tests [1], the CI job passes without increasing the timeout
when the RW (read/write) limit is set to 250MB, so let's set that value
as the default.
Also enable disk limitation by default.

[1] openstack-k8s-operators/ci-framework#2961

Change-Id: I3b2e81711145d398430b3830ed541040123f4535
Signed-off-by: Daniel Pawlik <dpawlik@redhat.com>
@github-actions

This PR is stale because it has been open for over 15 days with no activity.
Remove the stale label or add a comment, or this will be closed in 7 days.

@github-actions github-actions Bot added the Stale label May 28, 2025
@github-actions github-actions Bot closed this Jun 4, 2025